
arXiv:0901.1095v1 [cs.CR] 8 Jan 2009

FAIR: Fuzzy-based Aggregation providing In-network Resilience for

real-time Wireless Sensor Networks ∗



Emiliano De Cristofaro (1), Jens-Matthias Bohli (2), Dirk Westhoff (2)

(1) Information and Computer Science, University of California, Irvine

Irvine, CA, 92617

edecrist@ics.uci.edu

(2) NEC Laboratories Europe

Kurfürsten-Anlage 36

69115 Heidelberg, Germany

{bohli, dirk.westhoff}@nw.neclab.eu










Abstract. This work introduces FAIR, a novel framework for Fuzzy-based Aggregation providing In-network

Resilience for Wireless Sensor Networks. FAIR addresses the possibility of malicious aggregator nodes manipu-

lating data. It provides data-integrity based on a trust level of the WSN response and it tolerates link or node

failures. Compared to available solutions, it offers a general aggregation model and makes the trust level visible

to the querier. We classify the proposed approach as complementary to protocols ensuring resilience against

sensor leaf nodes providing faulty data. Thanks to our flexible resilient framework and due to the use of Fuzzy

Inference Schemes, we achieve promising results within a short design cycle.





1 Introduction



Wireless sensor networks (WSN) are increasingly being used to monitor physical conditions. For an energy-efficient

information flow, data aggregation is a core feature of medium and large scale sensor networks. Not only may it

be unnecessary to collect the sensed data from each sensor, but it could also result in the nodes’ energy rapidly

exhausting, due to the high communication load. As a result, most WSNs perform data aggregation – nodes process

incoming messages and aggregate the information according to a given aggregation function, e.g. average, sum,

minimum/maximum. A subset of nodes, called aggregator nodes, build a backbone and are responsible for the data

stream. The aggregator nodes may be dynamically selected to balance the load between all nodes.

In this work, we focus on sensor networks that consist of restricted low-cost devices, respond in real-time and

possibly without a fixed base station. Nevertheless, the data provided by these wireless sensor networks may have

impact on the real world, especially if the data is used to control actuators. For instance, accident prevention could

be achieved by sensing and providing road conditions: the driver or even the car automatically reacts to the sensor

information in real time [8]. Moreover, the application of WSNs to SCADA systems could support the protection

of critical infrastructures, such as power plants, power distribution networks or oil and gas pipelines [4]. In such

settings, we claim the main security goal to be resilience, which we characterize by: (i) providing aggregated data

integrity in the presence of bogus nodes altering the aggregation process, (ii) enforcing robust protocols in presence

of likely node or message failures, (iii) providing the querier with a measure on the accuracy of the aggregated value.

We present FAIR, a novel resilient aggregation framework that provides data-integrity, tolerates link or node

failures and node misbehavior, and returns a quality measurement of the aggregated value. We argue that the

quality measurement is an essential part for robust and secure aggregation. Since failure tolerance can help the

adversary to remain undetected with attacks, the integrity protection of the sensor information may be weakened.

Therefore, we introduce a quality measure that makes the trust level of the WSN response visible to the querier.

Previous work often concentrates on one protection goal: solutions for (i) are [42,13]. Very recently, protocols that

provide integrity and robustness are considered [23,38]. Algorithms that concentrate only on robustness – a solution

for (ii) – are [31,29,14,19]. Also recently, robust protocols consider in addition security against corrupted aggregator

nodes [20]. Quality of information – as in (iii) – has also been used for data aggregation, e.g. in database systems [32].

WSN solutions providing quality measures are still very rare [9,26]. We are not aware of any protocol that presents a

solution for (i), (ii) and (iii). More details on related work are provided in Section 6.



∗ This paper is an extended version of the paper that appeared at the 2nd ACM Conference on Wireless Network Security

(WiSec’09).

The FAIR framework uses witness nodes as introduced in [16]. However, FAIR goes one step further and uses

multiple witness nodes which not only confirm the result, but aggregate and forward the results themselves. This

produces redundancy that can help the protocol to improve data integrity and robustness. Nevertheless, usefully and

efficiently utilizing this redundancy is not a trivial task. To this aim, we note that the analysis of all the information

available at the aggregator node can be used to infer an estimation on the quality of the aggregated information

and choose those witness nodes that provide the highest value. We introduce a value, the quality of information, to

assess the degree of trust in the aggregated value. More details on our approach are given in Section 3. Our technique

is based on the concepts of Fuzzy Inference Schemes (from now on, FIS). Such techniques are known to be effective

in making real time decisions using incomplete information [18,21,40]. We claim that FIS are suitable in representing

a measure on the quality of information, by means of natural linguistic rules, over input values which cannot be

predicted. Moreover, we show that FIS are also suited to handling bogus input values up to a certain degree. We present

our Fuzzy Inference Scheme in Section 4 and give an insight on Fuzzy Logic in Appendix A.

We remark that the computation of the quality of information yields an additional benefit at the end of the aggregation

process, providing the querier with the final measure on the accuracy of the aggregation process. According to the

quality of information value the querier decides whether to accept or to decline the result of the query. Note that such

a hint on the correctness of the received aggregated value, although still fuzzy, is useful compared to receiving an

aggregated value without any quality inspection. Indeed, we show in Section 5 that the proposed framework improves

resilience and provides the querier with a meaningful measure of response accuracy.





2 Problem Model



Next, we present an overview of the network assumptions and the adversarial model used in our work.

Network Assumptions. The sensor network consists of n sensor nodes Ni, partitioned into clusters. All sensors

Ni have a unique identifier IDi. All nodes in a cluster are assumed to be within single-hop communication distance of

each other. The clusters do not necessarily have the same shape or size.

The network’s functionality is to collect and to provide data. In any epoch, the querier device may query the

sensor network for data from this epoch. The owner of the WSN is typically not interested in the entirety of sensor

readings, but rather in an application-specific evaluation of the sensor results. We denote the function that derives

the result from the sensor readings by f(v1, ..., vn), where vi is the sensor reading of sensor Ni in this epoch. To

allow data processing within the WSN, it must be possible to decompose the function f into local subfunctions, e.g.

f(v1, ..., v13) = f0(f1(f2(v1, ..., v3), f2(v4, v5), f2(v6, v7)), f1(f2(v8, ..., v11), f2(v12, v13))). In the rest of this work,

we solely assume aggregation functions holding this property. The following protocols are assumed to be in place:



– A routing protocol, e.g. tinyLUNAR [33]. This provides multi-hop communication between sensor nodes. To prevent

forwarding nodes from modifying or blocking messages, multi-path message propagation is advisable.

– A protocol for authenticated communication between nodes. Nodes have to be able to authenticate messages

to other nodes, namely to their aggregator nodes. This protocol could build on the multicast authentication

introduced by Canetti et al. [12].

– An aggregator node election protocol. This takes a set of sensor nodes {N1, ..., Nn} and the current epoch as

input and outputs a random sequence of aggregator nodes (Agg1, ..., Aggk) = (Ni1, ..., Nik), which are then

responsible for the aggregation of data of {N1, ..., Nn}. PANEL [10], LEACH [24], or SANE [37] ensure this. If the

node election protocol has to be non-manipulable (e.g. fully deterministic), SANE should be used. Furthermore,

the protocol needs to be predictable, meaning that all inputs are public, such that anyone can compute the

aggregator nodes for any epoch and any set of nodes.
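The predictability requirement can be illustrated with a small sketch. It is not PANEL, LEACH, or SANE: the function name `elect_aggregators` and the SHA-256 ranking are assumptions made purely for illustration. It only shows how, when all inputs are public, anyone can recompute the aggregator sequence for a given epoch and node set.

```python
import hashlib

def elect_aggregators(node_ids, epoch, k):
    """Return k aggregator IDs derived only from public inputs.

    Hypothetical scheme for illustration; not one of the cited protocols.
    """
    def rank(node_id):
        # Hash of (epoch, node ID): any party can recompute this value.
        data = f"{epoch}:{node_id}".encode("utf-8")
        return hashlib.sha256(data).hexdigest()

    # Deduplicate and sort first so the input order does not matter,
    # then order by hash so the sequence changes from epoch to epoch.
    ordered = sorted(sorted(set(node_ids)), key=rank)
    return ordered[:k]
```

Because the ranking depends only on the epoch and the node identifiers, two parties holding the same node set obtain the same sequence without exchanging messages.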



Adversarial Model and Security Requirements. We assume the adversary is in complete control of the wireless

channel. The attacker can eavesdrop data over the wireless broadcast medium or control the communication channel

to catch, destroy, modify and send data. As we aim at implementing the protocols on the low-end price class of

physically unprotected sensor nodes, the adversary is assumed to be able to take control over several sensor nodes.

As a security assumption, a threshold on the number of corrupted nodes will be assumed. If the adversary controls

a sensor node, she gains knowledge of all the sensitive information stored at this node. In our security evaluations,

we will distinguish between naïve attackers and smart attackers. The naïve attacker operates by means of randomly

distributed corrupted nodes and does not perform any kind of collusion or collaboration within the compromised

nodes. This adversarial strategy covers a realistic scenario where an attacker manages to cause malfunctions in nodes

but does not have a powerful communication equipment in order to let the compromised nodes communicate and






collude in an undetectable manner. The attacker’s strategy is to perform as much damage as possible during the

aggregation process without being discovered by the user. The smart attacker’s goal is the same as the naïve attacker’s:

altering the aggregation process without being discovered by the user. However, this attacker is more powerful, in the

sense that she cooperatively operates to make the querier accept false data. Compromised witness nodes communicate

and collude in order to report the same bogus aggregated value. Moreover, the attacker has some knowledge on the

network topology. Thus, she tries to focus the attack on nodes at the same aggregation level, such that witness nodes

from the same level can collude. This adversarial strategy covers the worst scenario for our framework.

As discussed above, there are three primary security goals. First, we want to protect data integrity and prevent

the stealthy attack [34] that makes a querier accept a flawed aggregated value. Then, the framework should also be

robust, which means the adversary cannot prevent the network from providing a response as long as the number

of corrupted nodes and blocked channels is limited. As the number of adversarial interactions increases, a higher

probability of a larger deviation of the aggregated value is tolerated and is in fact unavoidable. However, the expected

deviation should ideally be expressed by a quality of information value (see Section 3.4). Finally, we want to provide

the querier with a measure on the accuracy of the entire aggregation process.





3 The FAIR Architecture



Witness nodes have often been employed to confirm the result of aggregator nodes, in order to ensure the integrity

of data during aggregation. For instance, Du et al. [16] use witness nodes that receive the same input as the aggregator

node but do not forward the result. Instead, each witness computes a MAC for the result and forwards

the MAC to the aggregator node. The aggregator node collects the MACs from the witness nodes and forwards the

aggregation result and all MACs to the base station. This approach has been extended to a multi-level aggregator

hierarchy in [8]. In a multi-level hierarchy, the confirmations of the witnesses at lower levels have to be checked within

the network. This requires a multicast authentication protocol as outlined in Section 2.

In contrast, we propose a protocol where several witness nodes not only confirm the aggregator’s result,

but aggregate and forward the result themselves. Thus, we break down the distinction between the aggregator and

witness nodes and will only use the term aggregator nodes (from now on, AN). This results in more robustness and

flexibility: there is no dependency on a single AN. Furthermore, the aggregator nodes on a higher level receive the full

data and extract information even if the nodes disagree. Figure 1 shows an example aggregator hierarchy with

two aggregating nodes per cluster. All sensor nodes are physically located on the sensor node level, partitioned into

clusters. The tree shows a logical aggregator hierarchy. Every node on an in-network aggregation level symbolizes an

AN that is elected out of one of its descendants. Thus, it is not a new node, but one of the leaf nodes that fulfills an

additional role.





[Figure 1: logical hierarchy, from top to bottom: trusted base station or reader device; top in-network aggregation level (sink nodes); in-network aggregation level 1; sensor aggregation level; sensor nodes in clusters.]



Fig. 1. Aggregator hierarchy with two aggregators per cluster. The dashed lines show the communication flow that

is caused by the additional aggregator or witness nodes, respectively.








3.1 Redundancy

We let the number of aggregators be configurable. This allows the querier to trade resilience for efficiency. As outlined

in Section 2, the aggregator node election protocol outputs a sequence of w potential aggregator nodes. Every node in

the WSN can potentially act as aggregator node. This comes at the cost of a reduced protection of the data integrity:

from the aggregated result, it is not visible if errors or disputes occurred during the aggregation process. However,

this gives the flexibility to aggregate via additional nodes in case the assigned aggregator nodes fail or do not reach

an agreement. This flexibility helps the protocol robustness, because any aggregator node that fails can be replaced.



3.2 Quality of Information

To make anomalies that occurred in the aggregation process visible to the querier, we propose computing, in a distributed

manner, a value that we term Quality of Information (QoI). The QoI conveys the level of trust that can be

placed in the aggregated value. The QoI value allows the querier to assess the confidence of the aggregated sensor

data. Moreover, the QoI enables an AN to gain the most benefit from the redundancy introduced by the witness

nodes. We expect that choosing those witness nodes that provide the highest value will result in a higher accuracy

of the aggregation process, despite the presence of adversaries or errors. Certainly, this value is again exposed to

attacks and cannot generally be trusted. Here again, the witness architecture is used to validate the QoI value, i.e. a

single aggregator that manipulates the QoI gets caught if the responses of further ANs of this subtree are considered.

In principle, two origins of quality loss can be distinguished. 1) As discussed above, the resilience

introduced to the aggregation process makes a quality processing necessary to account for the increased susceptibility

to attacks. 2) Another aspect of QoI arises immediately in the processing of the sensor data on the sensor aggregation

level. Here, e.g., the intensity of the event detection or the number of participating sensors can constitute the QoI.

The work at hand purely concentrates on the first issue, the processing of the QoI due to anomalies in the aggregation

process.

The quality of information will be computed in the data aggregation process. Aggregators on the sensor aggre-

gation level will come up with an initial quality of information value depending on the completeness of sensor data

in their cluster. Aggregators in the first in-network aggregation level are the first to start the anomaly based QoI

evaluation. Aggregators on higher aggregation levels will continue processing the QoI. The QoI value will be reported

to the querier by the sink nodes along with the aggregated WSN information. In Section 4, we will present our

implementation computing QoI by means of a Fuzzy Inference Scheme (FIS).

Figure 2 gives a bird’s-eye view of the FAIR architecture and highlights its relevant building blocks. An in-network

aggregator node AN is responsible for aggregating messages, e.g., from the subtrees S1, S2, S3, where each Si is the

set of sensor nodes at the leaves of subtree i. It receives input data and QoI values from its children and outputs the

result of the aggregation and the new QoI value. As presented in Section 3.1, all the nodes in the same subtree should

ideally report to AN the same data and QoI values. In reality, however, the presence of attackers or link failures will

instead introduce variances. We remark that the use of multiple witness nodes to aggregate the values from the same

subset of nodes introduces some redundancy which can now help the protocol resilience. Our idea is to use the QoI

value to optimize the selection of nodes within the same subtree from which to get the data. This step is performed

by the FAIR inference module, which we introduce in Section 3.3.

We observe that in FAIR, the aggregation algorithm is performed in two steps.

1. The messages belonging to the same subtree are evaluated and filtered by the FAIR inference module. Its output

is a value vi expressing the (aggregated) sensed values of subtree i, together with an associated QoI value for the

considered subtree in the aggregation hierarchy. The inference module contains the intelligence of the resilient

data aggregation protocol, and we will suggest an implementation of this algorithm based on fuzzy logic. We

stress that the logic of this module is independent from the concrete aggregation function. More details on this

will be shown in Section 5.

2. After the FAIR inference module has been fired, the algorithm aggregate performs the actual aggregation function

over the results of different subtrees. This algorithm implements a decomposable aggregation function as outlined

in Section 2. Moreover, the different QoI values (which have been independently computed in the FAIR inference

modules) for each subtree will be averaged to yield a single QoI output for all the subtrees.
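The second step can be sketched as follows. This is a minimal illustration, assuming the aggregation function is the average and that each subtree result is carried as a (sum, count) pair so that it stays decomposable; the names `Partial`, `aggregate`, and `value` are hypothetical and not taken from the FAIR implementation.

```python
from dataclasses import dataclass

@dataclass
class Partial:
    """Partial state of an average, kept decomposable as (sum, count)."""
    total: float
    count: int
    qoi: float  # QoI attached to this subtree by the inference module

def aggregate(partials):
    """Merge per-subtree partial states and average their QoI values."""
    if not partials:
        raise ValueError("no subtree results to aggregate")
    total = sum(p.total for p in partials)
    count = sum(p.count for p in partials)
    # The per-subtree QoI values are averaged into a single output QoI.
    qoi = sum(p.qoi for p in partials) / len(partials)
    return Partial(total, count, qoi)

def value(p):
    """Average represented by a partial state."""
    return p.total / p.count
```

Carrying the sum and the count separately is one common way to keep an average decomposable across levels; other functions such as sum or minimum/maximum need only a single value per subtree.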



3.3 Fuzzy-Inferred QoI

In the following, we look in more detail at the design space of the FAIR inference module. This module harmonizes

values received from aggregators within one subtree. The inputs are data values and QoI estimations of aggregator






Fig. 2. Building blocks of a single aggregator node within the FAIR architecture.







nodes in a single subtree. As all those aggregator nodes are witnesses to each other, these values are identical in the

absence of link failures and attacks.

Figure 3 shows the details of the FAIR inference module. In the figure, we suppose that for each subtree the AN

receives data from 5 witness nodes but considers only 3 of them. However, those parameters are

configurable in our implementation. The selection is made by the inference module. Therefore, the inference module

has a twofold goal: (i) employing the witness node redundancy to use the best selection of nodes from which to get

data, and (ii) employing the QoI to guide the node selection. The inference module consists of two components:

a fuzzy inference module, used to compute the QoI and a filter, used to select the witness nodes whose data will be

further considered in the aggregation process. At the beginning, the filter will only select the values received via the

radio interface. The fuzzy inference module computes the QoI over those inputs and, if this value is above a threshold,

then those values will be averaged and the result will be directed to the aggregation function. The computed QoI

value serves to estimate the quality of the subtree. However, if the computed QoI value is below the threshold, the

node with lowest QoI will be discarded and another node will be considered. This process iterates until the resulting

QoI value is above the threshold. If the threshold has not been reached but no further node is available, the process

selects the combination of nodes resulting in the highest QoI value.
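The iterative selection described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: `select_witnesses` is a hypothetical name, the QoI computation is passed in as `qoi_fn`, the default threshold of 0.8 is a placeholder, and the fallback keeps the best selection seen during the iteration rather than searching all combinations.

```python
from statistics import mean

def select_witnesses(reports, qoi_fn, k=3, threshold=0.8):
    """Pick k witness reports whose combined QoI exceeds the threshold.

    reports: list of (value, qoi) pairs in arrival order.
    qoi_fn:  callable scoring a list of reports (e.g. a fuzzy module).
    Returns (averaged value, QoI of the chosen selection).
    """
    if not reports:
        raise ValueError("no reports received")
    selected = list(reports[:k])   # start with the first k reports
    waiting = list(reports[k:])    # spare witnesses not yet considered
    best_score, best_sel = qoi_fn(selected), list(selected)
    while best_score <= threshold and waiting:
        # Discard the report with the lowest QoI and try the next spare.
        worst = min(selected, key=lambda r: r[1])
        selected.remove(worst)
        selected.append(waiting.pop(0))
        score = qoi_fn(selected)
        if score > best_score:
            best_score, best_sel = score, list(selected)
    # Either the threshold was exceeded or no spare node is left;
    # in the latter case the best selection seen so far is used.
    return mean(v for v, _ in best_sel), best_score
```

With 5 reports and k = 3, at most two replacements are tried, which bounds the number of QoI evaluations per subtree to three.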

It is not straightforward to determine on which basis to compute an effective QoI value. In fact, there is no clear

indication of ongoing attacks. However, an AN can infer some evaluation on the quality of the aggregated data, by

analyzing:

1. Consistency - the degree of agreement of lower-level aggregators in the same subtree on the data. A significant

deviation indicates the presence of errors or lower-level ANs reporting bogus information.

2. Completeness - each AN is aware of the nodes that are expected to send a message (namely, the nodes of the

respective subtree Si ). If messages of some of the nodes are missing, this could potentially point to link failures

or node exhaustion or to an ongoing attack. Also the cases where the message is corrupted or the QoI is so low

that the information should actually be ignored can be treated as incomplete information.

3. QoI from the lower level - This expresses the QoI evaluation over the data to be aggregated and must be taken

into account when computing the current QoI value.

We argue that FISs are beneficial for the FAIR framework. Indeed, FISs are effective in making real time decisions

with incomplete information. A potential alternative solution could be to use probabilistic uncertainty models for

describing the quality of information. However, those models need significant statistical information to extract prob-

ability distributions, but, within this context, there is no such information other than a few vague data. To this aim,

we argue that a FIS may provide a possibility to assess QoI by means of natural linguistic terms and rules.

However, we want to point out that the use of a FIS in the FAIR inference module to compute the quality of

information needs to be defended against some counterarguments:

– no setpoint exists to train the module - this holds except for the ideal case that no attacker is present and all

data are transmitted reliably, for which we should have perfect QoI;






Fig. 3. The FAIR inference module.







– no feedback loop exists - generally fuzzy controllers continuously fire rules with the aim to stabilize a process by

comparing its actual output value with the setpoint; in our setting, for each aggregation process several fuzzy

inference modules fire once without any feedback loop for a given setpoint;

– distributed - typical fuzzy controllers are centralized whereas our scheme is highly decentralized, i.e. the same

fuzzy inference module instantiation is located on each aggregator node, probably firing different rules with

different actual input values for QoI, Completeness and Consistency;

– bogus data - fuzzy controllers are usually applied to control loops where the actual status of the process differs

from the setpoint due to environmental influences and/or overmodulation of the actual output value of the

controller, and not due to adversaries.





3.4 Remarks on the interpretation of QoI



At the end of the aggregation process, the sink node will report to the querier both the final aggregated value and the

final QoI value, which would ideally express the degree of trust that can be placed in the response accuracy. For this

value, we propose the following semantics:

\[
\mathrm{QoI}^{out} \;\Rightarrow\;
\begin{cases}
\text{do not use} & \text{if } \mathrm{QoI}^{out} \le 0.5 \\
\text{use only in non-sensitive settings} & \text{if } 0.5 < \mathrm{QoI}^{out} \le 0.8 \\
\text{use with high confidence} & \text{otherwise}
\end{cases}
\]
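On the querier side, this three-way classification amounts to two threshold comparisons. The sketch below uses the thresholds 0.5 and 0.8 from the text; the function name `interpret_qoi` is hypothetical, and the middle label is read as "use only in non-sensitive settings".

```python
def interpret_qoi(qoi_out):
    """Map the final QoI value to the three-way recommendation."""
    if qoi_out <= 0.5:
        return "do not use"
    if qoi_out <= 0.8:
        return "use only in non-sensitive settings"
    return "use with high confidence"
```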





Such a coarse-grained classification is easily understandable and therefore applicable by the querier. However,

we want to point out some observations with respect to this classification:

For a QoIout belonging to the category ’use with high confidence’ the results may have different causes:



1. The system is running in the absence of any bogus modifications without considerable link failures.

2. A small number of malicious nodes may have injected bogus data. However, these data have with high probability

been overruled by the FAIR distributed fuzzy inference module; this is true for the naïve attacker as long

as the corrupted nodes are relatively remote from the sink node.






3. A majority of bogus nodes from the same subtree agrees on the same wrong value. According to the concrete

level in the aggregation hierarchy, the impact is different.

As a result, if QoIout is ranked as ’use with high confidence’, some uncertainty remains as to whether

bogus data have affected the output; nevertheless, we will see that the aggregated value can be trusted to a high degree.

Conversely, for a QoIout belonging to the category ’do not use’, the result is due to one of the following reasons:

1. There was no attack; however, for one reason or another, not enough nodes contributed.

2. There was an attack and/or not enough nodes contributed such that the aggregation process is not presentable.

To conclude, if QoIout is ranked as ’do not use’, both subcases clearly indicate that one should not use the

aggregated value. Conversely, ’use with high confidence’ means that the data can be trusted to a high degree. Here, only

case 3, i.e. a colluding attacker, could cause the querier to accept a bogus aggregated value despite a high QoI.





4 Fuzzy Inference Module









Fig. 4. Fuzzy Sets describing the Fuzzy Input and Output Variables.









4.1 Design

At each aggregation step, a measure on the quality of information is computed by the aggregator nodes to assess

the level of trust that can be placed on the aggregated value. Our approach is based on Fuzzy Inference Schemes

(FIS). In recent years, the research community has shown that such schemes are well suited to making real-time

decisions with incomplete information. Standard control systems in fact need to rely on an accurate representation

of the environment. By contrast, FISs are suitable in representing a measure on the quality of information, by means

of natural linguistic rules, over input values which are imprecise and potentially bogus.

Recall that our lightweight implementation neither requires any change to the network topology nor does it require

any central authority to execute the fuzzy inference module. In fact, each aggregator node runs the same instance of

the controller with its local inputs.

Fuzzy Variables. At each step, an AN computes the QoI using these three fuzzy variables:

1. Completeness - the fraction of children nodes responding to the query.

2. Consistency - normalized standard deviation on data received from children nodes.

3. QoI- the QoI values received from the children nodes.
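The first two inputs can be computed from the messages an AN receives. The sketch below is an illustration only: the function names are hypothetical, and the normalization is an assumption. In particular, the standard deviation is divided by half of an assumed value range and then inverted, so that a high Consistency value means good agreement, which is how the rules below use the term.

```python
from statistics import pstdev

def completeness(received, expected):
    """Fraction of expected children that responded to the query."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    return min(received / expected, 1.0)

def consistency(values, value_range):
    """Agreement of reported values in [0, 1]; 1 means identical reports.

    Assumption: the standard deviation is normalized by half the value
    range (its maximum for bounded data) and inverted.
    """
    if value_range <= 0:
        raise ValueError("value_range must be positive")
    if len(values) < 2:
        return 1.0  # a single report cannot disagree with itself
    spread = pstdev(values) / (value_range / 2.0)
    return max(0.0, 1.0 - min(spread, 1.0))
```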

Fuzzy Sets. The linguistic variables used to represent the fuzzy variables are intuitive and simple. We distinguish

among three attributes: small, medium, and high. However, for each fuzzy variable those attributes have a

different meaning and may have a slightly different shape according to our empirical experience. Figure 4 shows the

different fuzzy sets for the three input fuzzy variables and for the fuzzy output.

Fuzzy Rules. The core part of every fuzzy controller is the choice of the set of rules. In a first stage, we define a

very limited set of disjoint rules to mirror the straightforward expectation on the evolution of quality of information

values over the aggregation process. We define the following set of rules:

(a) if (QoI is small and Consistency is small and Completeness is small) then QoIout is small






(b) if (QoI is medium and Consistency is medium and Completeness is medium) then QoIout is medium

(c) if (QoI is high and Consistency is high and Completeness is high) then QoIout is high

Subsequently, we iteratively refine our set of rules in order to capture a greater number of possible scenarios. In fact,

the above rules do not catch the conditions where the inputs have different fuzzy attributes. First, we remark that

the quality of information value should be computed as small not only when all the three inputs are small. In the

spirit of a conservative risk assessment, one small input value represents with high probability an ongoing attack.

Therefore, we skip rule (a) and add the following rules:

(1) if QoI is small then QoIout is small

(2) if Completeness is small then QoIout is small

(3) if Consistency is small then QoIout is small

Furthermore, we note that ideally the quality of information should be computed as high only when all the three

input values are high. As a result, the following rules are added to the rule set so that the output is medium when

at least one input value is medium:

(4) if (QoI is medium and Consistency is medium and Completeness is high) then QoIout is medium

(5) if (QoI is medium and Consistency is high and Completeness is medium) then QoIout is medium

(6) if (QoI is medium and Consistency is high and Completeness is high) then QoIout is medium

(7) if (QoI is high and Consistency is medium and Completeness is medium) then QoIout is medium

(8) if (QoI is high and Consistency is medium and Completeness is high) then QoIout is medium

(9) if (QoI is high and Consistency is high and Completeness is medium) then QoIout is medium

Moreover, we remark that having all three inputs as medium indicates with high probability an ongoing attack as well as the

impossibility for the fuzzy inference module to recover the aggregation process. Hence, we substitute rule (b) with:

(10) if (QoI is medium and Consistency is medium and Completeness is medium) then QoIout is small

We keep rule (c):

(11) if (QoI is high and Consistency is high and Completeness is high) then QoIout is high

Finally, we choose the min-max inference scheme and the weighted average method for the defuzzification. For more

details, we refer to Appendix A.
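Rules (1) to (11) with min-max inference and weighted-average defuzzification can be sketched in a few lines. The membership functions and output centers below are assumptions (simple triangular shapes on [0, 1]); the actual fuzzy sets are those of Figure 4, which this sketch does not reproduce. The function name `infer_qoi` is hypothetical.

```python
from itertools import product

# Assumed membership functions on [0, 1]; not the shapes of Figure 4.
def small(x):
    return max(0.0, 1.0 - x / 0.5)

def medium(x):
    return max(0.0, 1.0 - abs(x - 0.5) / 0.5)

def high(x):
    return max(0.0, (x - 0.5) / 0.5)

MEMBERSHIP = {"medium": medium, "high": high}
CENTERS = {"small": 0.0, "medium": 0.5, "high": 1.0}  # assumed centers

def infer_qoi(qoi, consistency, completeness):
    """Evaluate rules (1)-(11): min for AND, max per output set,
    then weighted-average defuzzification."""
    inputs = [min(max(v, 0.0), 1.0) for v in (qoi, consistency, completeness)]
    strength = {"small": 0.0, "medium": 0.0, "high": 0.0}
    # Rules (1)-(3): any single small input yields a small output.
    for x in inputs:
        strength["small"] = max(strength["small"], small(x))
    # Rules (4)-(11): every medium/high combination of the three inputs.
    for labels in product(("medium", "high"), repeat=3):
        fire = min(MEMBERSHIP[l](x) for l, x in zip(labels, inputs))
        if labels.count("medium") == 3:
            out = "small"    # rule (10)
        elif labels.count("high") == 3:
            out = "high"     # rule (11)
        else:
            out = "medium"   # rules (4)-(9)
        strength[out] = max(strength[out], fire)
    total = sum(strength.values())
    if total == 0.0:
        return 0.0  # no rule fired; be conservative
    return sum(strength[k] * CENTERS[k] for k in strength) / total
```

Under these assumed shapes, three perfect inputs give an output of 1.0, a single input of 0 forces the output to 0.0, and three inputs of exactly 0.5 also give 0.0 because only rule (10) fires.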

We remark that the FAIR inference module is deployed into each AN performing in-network aggregation (see

Figure 1). In fact, at the sensor aggregation level, the only information available to the aggregator nodes is the

Completeness. These nodes do not receive any quality of information from the child nodes and they cannot compute

any Consistency measure from the sensing nodes.



4.2 Implementation

We distinguish two different stages of the implementation of the fuzzy inference module: (i) testing and (ii) deploy-

ment.

The testing phase is essentially needed to test the behavior of the module in different network and adversarial

settings. During this phase, we work in a Java environment, enhanced by the JGraphT library [1] and the FuzzyJ

toolkit [6]. The JGraphT package is a free Java graph library that provides mathematical graph-theory objects and

algorithms. We use this package to simulate a sensor network through an undirected graph. Sensor nodes are simulated

by nodes in the graph. To represent two nodes within their transmission range, an edge is inserted in the graph.

The FuzzyJ toolkit instead is a set of Java classes that provide the capability for handling fuzzy concepts and fuzzy

inference based on fuzzy logic. We use this package to implement the fuzzy inference module. At this stage, we design a simple and flexible network model and run simulations in order to tune the fuzzy sets and the rules.
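The graph model described above can be sketched in a few lines. The authors work in Java with JGraphT; the Python version below uses only the standard library and is meant as an illustration, not as their code. The unit-square deployment area, the uniform random placement, the transmission range of 0.2, and the name `build_network` are all assumptions.

```python
import math
import random


def build_network(num_nodes, tx_range, seed=0):
    """Place nodes uniformly at random in the unit square and connect every
    pair of nodes whose Euclidean distance is within the transmission range."""
    rng = random.Random(seed)
    positions = {n: (rng.random(), rng.random()) for n in range(num_nodes)}
    adjacency = {n: set() for n in range(num_nodes)}
    for a in range(num_nodes):
        for b in range(a + 1, num_nodes):
            if math.dist(positions[a], positions[b]) <= tx_range:
                # Undirected edge: record it in both directions.
                adjacency[a].add(b)
                adjacency[b].add(a)
    return positions, adjacency


positions, adjacency = build_network(num_nodes=100, tx_range=0.2)
num_edges = sum(len(neigh) for neigh in adjacency.values()) // 2
print(f"{len(adjacency)} nodes, {num_edges} edges")
```

Link failures can then be simulated by removing edges from `adjacency`, and node failures by removing a node together with its incident edges.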

In a second stage, we focus on how to deploy our framework on real sensor nodes, e.g. Mica2/MicaZ. We are

aware that sensor nodes are tiny devices with limited memory, computational power, and battery life. As a result,

the introduction of the Fuzzy Controller has to be carefully designed. It is in fact unfeasible to embed a complex

library in a sensor node to implement the Controller. Therefore, we make use of the JFS toolkit [30]. JFS is a

development environment for the programming language JFL. The environment includes tools to compile, run,

improve and convert JFL programs. JFL is a special-purpose language used to write functions. It combines features from traditional programming languages (like C, Pascal, Basic, etc.) with fuzzy logic and machine learning. The

most important feature of this toolkit for our goal is the ability to convert the JFL code to a source code file in

standard ANSI C. The resulting source code does not need any additional library or package, but it is ready to run.

Furthermore, the resulting algorithm does not require any relevant computational power, since in practice the fuzzy

inference module will only make a set of function evaluations and cascaded if-then-else statements. Having standard

and simple C code allows the deployment on a large set of sensor network platforms. Indeed, the most widespread sensor nodes, i.e. Mica motes, run the TinyOS operating system [7]. This makes it easy to deploy algorithms

written in the nesC programming language [3], an extension to the C programming language designed to embody

the structuring concepts and execution model of TinyOS. Hence, our final step is to use the JFS toolkit to generate

a standard C source code for our fuzzy inference module. Then, porting the C code to nesC is straightforward and

we can finally deploy our fuzzy-driven aggregation mechanism on Mica motes.





5 Security and Overhead Analysis

In this section, we evaluate our framework. We want to show that the use of Fuzzy-inferred Quality of Information

improves the accuracy of the final aggregated value, (i) in presence of bogus nodes altering the aggregation process, (ii)

in presence of node or link failures. Moreover, (iii) we show that our framework provides the querier with a reasonably

accurate estimation of the quality of the query process, so that the querier has a rationale to decide whether to accept

the result or not. We also show that the fuzzy inference module remarkably increases the accuracy of the aggregation process, without performing any heavy operation or increasing the communication load. Indeed, the fuzzy inference module embedded into the FAIR architecture helps the framework improve the choice of the witness nodes used to compute the aggregation function. The presence of bogus ANs results in lower Consistency values, and the failure of some links results in lower Completeness values; both lead to a lower QoI for

the subtree. This would lead upper level ANs to try to discard the aggregated value coming from that subtree in

favor of subtrees with higher QoI. On the contrary, without using the fuzzy inference module, the AN would have no

way to perform this choice.

To confirm our analysis, we perform a simulation analysis using the Java testing environment presented in Section 4.2. As presented in Section 2, we distinguish between two types of attackers: (i) the naïve attacker compromises bogus nodes randomly distributed within the network, and (ii) the smart attacker coordinates colluding bogus nodes.

Network topology. We use an average-sized sensor network, composed of 100 sensing nodes. We

choose an average cluster size of 12 nodes and we set ANs at the in-network aggregation levels to aggregate from an

average of 3 lower level ANs. Finally, we choose an average number of 4 witness nodes for each AN. As a result, we

have 3 aggregation levels, plus the final one at the sink.
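One way to read these numbers is as a balanced aggregation tree: the sensing nodes are grouped into clusters, and each higher level aggregates from about three ANs of the level below. The short Python sketch below works through that arithmetic. Rounding up at each level, the balanced-tree assumption, and the name `aggregation_levels` are ours; the simulated topology may differ in detail.

```python
import math


def aggregation_levels(num_sensors, cluster_size, fan_in):
    """Return the number of ANs at each aggregation level, bottom-up."""
    # Lowest level: one AN per cluster of sensing nodes.
    levels = [math.ceil(num_sensors / cluster_size)]
    while levels[-1] > 1:
        # Each higher-level AN aggregates from `fan_in` lower-level ANs.
        levels.append(math.ceil(levels[-1] / fan_in))
    return levels


levels = aggregation_levels(num_sensors=100, cluster_size=12, fan_in=3)
print(levels)  # [9, 3, 1]
```

Under these assumptions, 100 sensing nodes in clusters of 12 give 9 ANs at the lowest level, then 3, then 1, i.e. three aggregation levels below the sink, which matches the count stated above.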

Finally, we present an overview on the energy and computation overhead needed by our framework.



5.1 Naïve Attacker

In a first stage, we evaluate the behavior of our framework in presence of bogus ANs, assuming neither node nor link

failures. We let a bogus node report a value randomly ranging from 10% to 90% of the correct value. The

chosen aggregation function is the average. Figure 5(a) shows a remarkable increase in the accuracy thanks to the

use of the fuzzy inference module, enabling every AN to improve the choice of witness nodes.
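To make the naïve attacker model concrete, the following Python sketch shows how randomly placed bogus nodes distort an unprotected average. It illustrates the attacker model only, not the witness-based FAIR defense. The assumptions that every honest node reports exactly the correct value, that accuracy is measured as the ratio between the aggregate and the correct value, and the function name are ours.

```python
import random


def naive_attack_average(correct_value, num_nodes, bogus_fraction, seed=0):
    """Average of the reports when a random subset of nodes is bogus.

    Bogus nodes report a value drawn uniformly from 10% to 90% of the
    correct value, as in the naive attacker model; honest nodes report
    the correct value (a simplifying assumption of this sketch).
    """
    rng = random.Random(seed)
    num_bogus = round(bogus_fraction * num_nodes)
    bogus = set(rng.sample(range(num_nodes), num_bogus))
    reports = [
        correct_value * rng.uniform(0.1, 0.9) if n in bogus else correct_value
        for n in range(num_nodes)
    ]
    return sum(reports) / len(reports)


for fraction in (0.0, 0.1, 0.3):
    aggregate = naive_attack_average(100.0, num_nodes=100, bogus_fraction=fraction)
    print(f"{fraction:.0%} bogus -> accuracy {aggregate / 100.0:.2f}")
```

Since a bogus report averages about half of the correct value, each additional 10% of bogus nodes lowers the unprotected average by roughly 5% in expectation; this is the baseline that a QoI-driven choice of witnesses is meant to improve on.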

Subsequently, we consider the case where, besides bogus ANs, a percentage of links fail. This could happen for two possible reasons: (i) the AN has prematurely run out of battery and is unable to receive/transmit messages, (ii) an adversary is occupying the node and preventing some or all message transmissions. For this experiment, we use the same network configuration. However, in this case, instead of the number of bogus ANs, we vary the number of link failures. Link failures themselves do not falsify the result. The lower quality is justified by the reduced redundancy that facilitates attacks. Therefore, we simulate the setting where both link failures and a fixed 10% of bogus nodes occur. Figure 5(b) confirms a remarkable gain in accuracy thanks to the use of the fuzzy inference module also in this setting. Next, we show that in case of a naïve attack the final QoI value received by the querier is

effectively providing a reasonable measurement on the degree of trust to give the aggregation process, according to

the semantic introduced in Section 4. For this reason, we run some simulations showing that w.h.p. the QoIout value

corresponds to the degree of accuracy of the final aggregated value. Once again, the average aggregation function

has been chosen. Figure 5(c) shows that the accuracy of the aggregation process grows linearly with the QoI value.

Furthermore, it shows that each class of QoI values (do not use, use only in non-sensitive settings, use with high confidence) corresponds to the right degree of query accuracy. We conclude that in the presence of purely naïve attackers, our QoI-based metric can be effectively used to evaluate the entire aggregation process in real time, without the need of any post-aggregation communication, such as commitment disclosure (as in [42,13]).

Fig. 5. Naïve adversary (panels (a), (b), (c))

Fig. 6. Smart adversary without knowledge of the topology (panels (a), (b), (c))

Fig. 7. Smart adversary with topology knowledge (panels (a), (b), (c))



5.2 Smart Attacker

We perform the same set of experiments as for the naïve attacker with the same topology, and with the same aggregation function, i.e. the average. In a first stage, we let the adversaries collude, i.e. report the same bogus value when acting as witnesses for the same subtree. All the adversaries report 50% of the correct aggregate value. Figure 6(a) focuses on the presence of bogus ANs, while Figure 6(b) focuses on the presence of 10% of bogus ANs and an increasing number of link failures. The gain obtained by the use of the FAIR inference module is smaller than with the naïve adversary, but still significant. Finally, Figure 6(c) gives an idea of the degree to which the QoI reflects reality in the presence of a smart adversary. As we can see, once about 30% of the nodes are compromised, the inference module is fooled by the collusion of the adversaries. In fact, the QoI value begins to increase instead of decreasing as the number of bogus nodes grows. As a result, we remark that the FAIR framework can be considered resilient in the presence of no more than approximately 30% of compromised nodes.

In a second step, we test the situation where the adversary is also given information on the topology, such that

she can focus on attacking nodes at the same level, besides colluding to report the same bogus information. The

network used for our simulations is still the same as presented above. Therefore, under this setting three different

aggregation levels are presented and attackable.

Figure 7 shows only a slight accuracy gain through the use of the FAIR inference module on the three different

aggregation levels. Furthermore, because of the collusion, an adversary compromising ANs at the same level can

fool the inference module which will always obtain a high level of consistency and completeness. As a result, in this

extreme setting, the QoI provides no remarkable information on the accuracy of the aggregation process. However, we argue that cryptographic protocols addressing this setting would require a high number of message exchanges (and thus are not practicable for real-time traffic).



5.3 Independence from aggregation function

In the previous sections, we evaluated the behavior of the framework when the average is chosen as aggregation

function. This function has the nice collateral feature of mitigating errors by averaging multiple values. In the

following, we want to highlight that the resilience of our framework does not depend on the particular aggregation function used. Therefore, we present some experimental evaluation of the framework with non-mitigating functions such as the maximum computation (from now on max).

The results have the same degree of accuracy and significance as those shown for the average aggregation. Figure 8 shows the results for a naïve attacker for max aggregation, while Figure 9 focuses on a smart attacker without topology knowledge for max aggregation. Recall that all the adversaries report 50% of the correct aggregate value. Finally, Figure 10 tests the situation where the adversary is also given information on the topology, such that she can focus on attacking nodes at the same level, besides colluding to report the same bogus information. The network used for our simulations is still the same as introduced above. However, the aggregation function is the max. These results indicate that the FAIR framework provides in-network resilience independently of the actual aggregation function.



5.4 Overhead

Finally, we want to show that the use of the fuzzy inference module (as the core component of our FAIR architecture)

to drive the aggregation process results in no relevant time or energy overhead. After deploying our code in the TinyOS

environment, we evaluate the energy and computation overhead by means of the Avrora Simulation Framework [5].

This tool is a set of simulation and analysis tools for programs written for the AVR microcontroller produced by

Atmel and the Mica2 sensor nodes. In particular, we test our code on the Mica2dot motes, which are smaller than

Mica2 ones and better suited for commercial deployment [2]. We remark that we evaluate only the FAIR inference

module, not the entire framework aggregation architecture.

The data transmission costs for the QoI (4 or 8 bits) do not cause any additional energy consumption for transmission. The footprint, i.e. the occupied memory, is as low as 36 KB. The simulations show that our FAIR inference module consumes an average of $4 \cdot 10^{-4}$ Joule and about 13000 CPU cycles, taking an average time of about 1 ms for firing the rules. Note that this is even more efficient than a hash function computation. As a result, we claim that the use of the fuzzy inference module to drive the aggregation process results in no remarkable overhead, whereas it improves the resilience of the aggregation process without the use of any heavy cryptographic operation.

Fig. 8. Naïve adversary for max aggregation (panels (a), (b), (c))

Fig. 9. Smart adversary for max aggregation without topology knowledge (panels (a), (b), (c))

Fig. 10. Smart adversary for max aggregation with topology knowledge (panels (a), (b), (c))





6 Related Work



Faulty data can originate from a sensor node or a malicious aggregator. As the adversary is always able to manipulate the sensed environment such that a sensor will report faulty data, the aggregation function should limit the influence of a single sensor. Such proposals can be found in e.g. [39,11]. Protection against malicious aggregation is guaranteed by protocols such as [42,13]. These protocols first aggregate the information without ensuring integrity. In a second step, the reader initiates the validation of the information. Thus, we argue that these protocols are not well suited for unattended sensor networks where a reader device or base station is only temporarily present [15]. These protocols also do not tolerate any failure; only the recent proposals of Haghani et al. [23] and Taban et al. [38] bring in capabilities to detect malicious nodes as a way to provide robustness. SecureDAV [28] uses threshold public key cryptography, which we argue is not suitable in a sensor network. Another way, which is followed by FAIR, is to base the integrity on witness nodes [16]. Robustness can be provided by requiring only a threshold number of witnesses to answer.

Protocols that aim at robust aggregation frequently do not use any aggregation hierarchy but use directed flooding

to spread the messages [31,29,14]. This creates multiple aggregation paths and needs the aggregation function to be

invariant against double counting. The latter can be achieved by approximate algorithms using counting sketches [19].

Recent proposals also consider approximate aggregation that is secure against corrupted aggregator nodes [20].

The idea of embedding into data additional details, such as quality of information, is a common solution in many

problems, such as knowledge discovery or data mining. The same concept has been used in the context of data

aggregation. For instance, the well-known solution proposed in [32] aimed to “drive” the integration of heterogeneous

databases by exploiting quality-information embedded in the data. Furthermore, in the context of data aggregation

in WSN, additional quality information has been used to enhance resilience of computation. In [9], nodes estimate

the quality they can deliver. In [26], authors proposed an aggregation framework that can detect and isolate spurious

measurements from computed aggregate values, by adding feedback information to the data.

The applicability of Fuzzy Control Systems to security problems was first proposed in [25], where the

notion of failure was represented in terms of fuzzy sets and widely interpreted. Several application aspects of fuzzy

methodology in system failure engineering were discussed. Instead, authors in [35] introduced a mechanism to control

QoS in IP Networks using Fuzzy Logic. A Fuzzy Controller is implemented in network routers in order to evaluate

policy conditions. For instance, the Fuzzy Controller takes as input information such as link status, packet loss, delay,

jitter, and so on. It then generates the policy actions as Fuzzy outputs, i.e. values needed for traffic classifications,

traffic shaping, dropping, scheduling, and so on.

Fuzzy Controllers have also been used in the context of WSN. The work in [22] introduced a Fuzzy-based approach

for the problem of cluster-head election. The Fuzzy Controller is a central control algorithm run at the base station, which is assumed to have global knowledge over the network. As input, the controller takes three Fuzzy values: energy, concentration, and centrality. The output is basically a probability for any single node to become cluster-head. Finally, [27] proposes a Fuzzy-based mechanism to counter selective forwarding attacks (malicious nodes dropping sensitive packets) in WSN. In particular, it addresses this problem by letting nodes select multiple paths in order to deliver data. However, the number of different paths must be optimized in order to achieve the best compromise between security and efficiency. For this reason, the authors make use of a Fuzzy Controller running at the base station, which takes as input the energy information and an estimation of the number of malicious nodes, and gives as a fuzzy output the suggested number of paths.





7 Conclusions



We proposed the Fuzzy-based FAIR framework for resilient data aggregation in real-time responsive wireless sensor

networks supporting in-network processing. Our simulation results validate that a distributively computed QoI value provides a remarkably good measure of the accuracy of an aggregated value. Moreover, we show that the use of the inference scheme improves the accuracy of the aggregation process. This is in particular true for naïve attackers and to some degree also holds for smart attackers. We highlight that the FAIR framework suits different classes of aggregation functions like average and min/max functions. Finally, being stimulated by our promising results based on Fuzzy Inference Schemes considering different classes of attackers, we encourage the security research community to make broader use of such concepts.






References

1. JGraphT: A free Java graph library. http://jgrapht.sourceforge.net/.

2. Mica2Dot. Wireless Microsensor Mote. http://www.xbow.com/products/Product pdf files/Wireless pdf/MICA2DOT

Datasheet.pdf.

3. NesC: A Programming Language for Deeply Networked Systems. http://nescc.sourceforge.net/.

4. SensiLink: Middleware to bridge Wireless Sensor Networks with SCADA systems. http://www.meshnetics.com/

wsn-software/sensilink.

5. The AVR Simulation and Analysis Framework. http://compilers.cs.ucla.edu/avrora.

6. The FuzzyJ Toolkit for the Java Platform. http://www.iit.nrc.ca/IRpublic/fuzzy/fuzzyJToolkit2.html.

7. TinyOS: An open source OS for the networked sensor regime. http://www.tinyos.net/.

8. J.-M. Bohli, A. Hessler, O. Ugus, and D. Westhoff. A Secure and Resilient WSN Roadside Architecture for Intelligent

Transport Systems. In WiSec’08, pages 161–171, 2008.

9. M. Breza, R. Anthony, and J. McCann. Quality-of-context driven autonomicity. In EEDAS 2007, 2007.

10. L. Buttyan and P. Schaffer. PANEL: Position-based Aggregator Node Election in Wireless Sensor Networks. In MASS’07,

pages 1–9, 2007.

11. L. Buttyán, P. Schaffer, and I. Vajda. RANBAR: RANSAC-based Resilient Aggregation in Sensor Networks. In SASN’06,

pages 83–90, 2006.

12. R. Canetti, J. A. Garay, G. Itkis, D. Micciancio, M. Naor, and B. Pinkas. Multicast Security: A Taxonomy and Some

Efficient Constructions. In INFOCOM’99, pages 708–716, 1999.

13. H. Chan, A. Perrig, and D. Song. Secure Hierarchical In-Network Aggregation in Sensor Networks. In CCS’06, pages

278–287, 2006.

14. J.-Y. Chen, G. Pandurangan, and D. Xu. Robust Computation of Aggregates in Wireless Sensor Networks: Distributed

Randomized Algorithms and Analysis. IEEE Transactions on Parallel and Distributed Systems, 17(9):987–1000, 2006.

15. R. Di Pietro, L. Mancini, C. Soriente, A. Spognardi, and G. Tsudik. Catch Me (If You Can): Data Survival in Unattended

Sensor Networks. In PERCOM’08, pages 185–194, 2008.

16. W. Du, J. Deng, Y. S. Han, and P. Varshney. A Witness-Based Approach For Data Fusion Assurance In Wireless Sensor

Networks. In GLOBECOM’03, volume 3, pages 1435–1439, 2003.

17. D. Dubois and H. Prade. Fuzzy Sets and Systems. 1988.

18. C. Elkan. The paradoxical success of fuzzy logic. IEEE Expert, 9(4):3–8, 1994.

19. P. Flajolet and G. N. Martin. Probabilistic Counting Algorithms for Data Base Applications. Journal of Computer and

System Sciences, 31(2):182–209, 1985.

20. M. N. Garofalakis, J. M. Hellerstein, and P. Maniatis. Proof Sketches: Verifiable In-Network Aggregation. In ICDE’07,

pages 996–1005, 2007.

21. G. Gerla. Effectiveness and multivalued logics. J. Symbolic Logic, 71(1):137–162, 2006.

22. I. Gupta, D. Riordan, and S. Sampalli. Cluster-head election using Fuzzy Logic for Wireless Sensor Networks. pages

255–260, 2005.

23. P. Haghani, P. Papadimitratos, M. Poturalski, K. Aberer, and J.-P. Hubaux. Efficient and Robust Secure Aggregation for

Sensor Networks. In NPSec’07, 2007.

24. W. R. Heinzelman, A. Chandrakasan, and H. Balakrishnan. Energy-Efficient Communication Protocol for Wireless Mi-

crosensor Networks. In HICSS’00, page 8020, 2000.

25. C. Kai-Yuan. System failure engineering and fuzzy methodology An introductory overview. Fuzzy Sets and Systems,

83(2):113–133, 1996.

26. Y. Kotidis, V. Vassalos, A. Deligiannakis, V. Stoumpos, and A. Delis. Robust management of outliers in sensor network

aggregate queries. pages 17–24, 2007.

27. H. Lee and T. Cho. Fuzzy-Based Reliable Data Delivery for Countering Selective Forwarding in Sensor Networks. In 4th

International Conference on Ubiquitous Intelligence and Computing, volume 4611, pages 535–544, 2007.

28. A. Mahimkar and T. S. Rappaport. SecureDAV: A Secure Data Aggregation and Verification Protocol for Sensor Networks.

In GLOBECOM’04, volume 4, pages 217–2179, 2004.

29. A. Manjhi, S. Nath, and P. B. Gibbons. Tributaries and deltas: efficient and robust aggregation in sensor network streams.

In SIGMOD’05, pages 287–298, 2005.

30. J. E. Mortensen. JFS Fuzzy System. http://inet.uni2.dk/jemor/jfs.htm, 2004.

31. S. Nath, P. B. Gibbons, S. Seshan, and Z. R. Anderson. Synopsis diffusion for robust aggregation in sensor networks. In

SenSys’04, pages 250–262, 2004.

32. F. Naumann, U. Leser, and J. Freytag. Quality-driven Integration of Heterogeneous Information Systems. In 25th VLDB

Conference, 1999.

33. E. Osipov. tinyLUNAR: One-Byte Multihop Communications Through Hybrid Routing in Wireless Sensor Networks. In

NEW2AN, pages 379–392, 2007.

34. B. Przydatek, D. Song, and A. Perrig. SIA: Secure Information Aggregation in Sensor Networks. In SenSys’03, pages

255–265, 2003.






35. R. Resende, N. Nassif, M. de Siqueira, A. da Silva, and M. Lima-Marques. Quality of Service Control in IP Networks

Using Fuzzy Logic for Policy Condition Evaluation. In FUZZ’05, pages 448–453, 2005.

36. T. Ross. Fuzzy Logic with Engineering Applications. Wiley, 2004.

37. M. Sirivianos, D. Westhoff, F. Armknecht, and J. Girao. Non-Manipulable Aggregator Node Election Protocols for Wireless

Sensor Networks. In WiOpt’07, pages 1–10, 2007.

38. G. Taban and V. Gligor. Efficient Handling of Adversary Attacks in Aggregation Applications. In ESORICS’08, LNCS,

2008.

39. D. Wagner. Resilient Aggregation in Sensor Networks. In SASN ’04, pages 78–87, 2004.

40. L. Wang. Adaptive Fuzzy Systems and Control: Design and Stability analysis. Prentice-Hall, 1994.

41. L. Wang. A course in Fuzzy Systems and Control. Prentice-Hall, Inc., 1996.

42. Y. Yang, X. Wang, S. Zhu, and G. Cao. SDAP: A Secure Hop-by-Hop Data Aggregation Protocol for Sensor Networks.

In MobiHoc’06, pages 356–367, 2006.

43. L. Zadeh. Fuzzy Sets. Information and Control, 8:338–353, 1965.





A Appendix: Fuzzy Control Systems



Fuzzy Logic concerns a fundamental methodology to represent and process linguistic information, with mechanisms

to deal with uncertainty and imprecision [43]. It extends boolean logic to assign to each proposition a degree of truth

value in between 0 and 1. As a result, the so-called fuzzy sets are sets whose elements have degrees of membership.

In classical set theory, the membership of elements in a set is expressed by a binary value according to a bivalent

condition – an element either belongs or does not belong to the set. By contrast, fuzzy set theory allows the gradual

assessment of the membership of elements in a set; this is described with the aid of a membership function (denoted

by µ), valued in the real unit interval [0, 1]. Fuzzy sets generalize classical sets, since the indicator functions of

classical sets are special cases of the membership functions of fuzzy sets, if the latter only take values 0 or 1 [17]. The most common shape of membership functions is triangular, although trapezoidal and bell curves are also used. The process of converting a set of real (crisp) numbers into fuzzy membership values is called fuzzification. An example

of such process is given in Figure 11.









Fig. 11. Fuzzification
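Fuzzification with triangular membership functions can be sketched as follows in Python. The three terms and their breakpoints are hypothetical and chosen only to keep the arithmetic easy to follow; they are not the fuzzy sets tuned for FAIR.

```python
def triangular(x, left, peak, right):
    """Triangular membership function with support (left, right) and apex at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical fuzzy terms for a temperature variable on [0, 100] centigrade.
# The outer corners sit just outside the universe of discourse so that the
# extreme values 0 and 100 get full membership in "cold" and "hot".
TERMS = {
    "cold": (-1.0, 0.0, 50.0),
    "warm": (0.0, 50.0, 100.0),
    "hot": (50.0, 100.0, 101.0),
}


def fuzzify(x, terms=TERMS):
    """Map a crisp value to its membership degree in every fuzzy term."""
    return {name: triangular(x, *params) for name, params in terms.items()}


print(fuzzify(25.0))
```

A crisp reading of 25 is thus half "cold" and half "warm" and not "hot" at all, which is the kind of gradual membership the paragraph above describes.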







Fuzzy Control Systems (FCSs) are based on fuzzy logic. Input variables in a FCS define the language used to

discuss a fuzzy concept. The fuzzy variables are characterized by a name (e.g., temperature), a universe of discourse

(e.g., [0 − 100]), the units of the variable (e.g., centigrade), and a set of primary fuzzy terms (e.g., hot, cold, warm).

Terms can also have modifiers (e.g., very) and operators (e.g., and). As for standard control systems, we distinguish

between input and output variables. In FCSs, each fuzzy variable will have an associated fuzzy membership value.

The key element of a FCS is the set of fuzzy rules. These govern and assess the behavior of the system, generating

a fuzzy output starting from the fuzzy input values. Fuzzy rules are sets of statements in the form of if antecedent

then conclusion. Every rule is evaluated in parallel using fuzzy reasoning. All the rules that apply are invoked,

using the membership functions and truth values obtained from the inputs, to determine the result of the rule. There

are several methods to determine the result. However, the most common one is the max-min inference scheme [41], in

which the output membership function is given the truth value generated by the premise (min). Then, the results of

all the rules that have fired are combined, by performing a fuzzy union of the membership values (max). This result

in turn needs to be converted into a concrete (crisp) value, through a procedure known as defuzzification. There are several different methods to defuzzify a value, each with various advantages and drawbacks. Among others, the center of gravity method is very popular: the center of mass of the result provides the crisp value, and it is computed by the following formula:

$$\mathrm{COG} = \frac{\int \mu_i(x) \cdot x \, dx}{\int \mu_i(x) \, dx} \qquad (1)$$

where $\mu_i(x)$ is the aggregated membership function and $x$ the output fuzzy variable. Another approach is the weighted average method, which finds the weighted average of the x values of the points that define a fuzzy set using the membership values of the points as the weights, according to the following formula:

$$\mathrm{WAVG} = \frac{\sum_{i=1}^{n} m_i w_i}{\sum_{i=1}^{n} m_i} \qquad (2)$$

where $m_i$ is the membership value of rule $i$, and $w_i$ is the weight associated with rule $i$. This method is computationally faster and easier, and gives fairly accurate results [36]. An example of a FCS using the max-min inference scheme is given in Figure 12.

Fig. 12. Max-min Inference Scheme
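The weighted average formula (2) translates directly into code. The Python sketch below is a minimal illustration; the rule strengths and the crisp values standing in for the output terms are hypothetical, and raising an error when no rule fires is a design choice of this sketch, not something prescribed by the method.

```python
def weighted_average(memberships, weights):
    """Weighted-average defuzzification: sum(m_i * w_i) / sum(m_i)."""
    if len(memberships) != len(weights):
        raise ValueError("memberships and weights must have the same length")
    total = sum(memberships)
    if total == 0:
        raise ValueError("no rule fired: the crisp output is undefined")
    return sum(m * w for m, w in zip(memberships, weights)) / total


# Hypothetical example: two rules fired with strengths 0.6 and 0.1; their
# output terms are represented by the crisp values 0.2 and 0.9 respectively.
crisp = weighted_average([0.6, 0.1], [0.2, 0.9])
print(round(crisp, 3))
```

Because the computation is a single pass of multiplications and additions followed by one division, it avoids the numerical integration needed for the center of gravity in formula (1), which is consistent with the remark that the method is computationally cheaper.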








